{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 机器学习 - 线性回归\n",
    "\n",
    "本文讨论线性回归，是机器学习中最简单的一种模型。同时我会讲到机器学习中一些常见的概念。\n",
    "\n",
    "## 再谈监督学习\n",
    "\n",
    "在监督学习中，模型需要从样本 $(x_i, y_i)$ 中学习到 $x$ 和 $y$ 的映射关系，即函数函数 $y = f(x)$。\n",
    "\n",
    "### 回归模型\n",
    "\n",
    "当 $y$ 是连续值的时候，监督学习的任务是学习到 $x$ 和 $y$ 的定量关系。如下图所示，$x$ 和 $y$ 的映射关系可以表示为一个一次函数。这种将 $x$ 映射到连续空间的问题就叫做回归。\n",
    "\n",
    "![](https://wangyu-name.oss-cn-hangzhou.aliyuncs.com/superbed/2019/09/25/5d8b02fa451253d178802ae2.jpg)\n",
    "\n",
    "### 分类问题\n",
    "\n",
    "如果 $y$ 的取值范围是离散的集合，比如 {是, 否}、{高级,中级,初级}，我们称这种问题为分类问题。\n",
    "\n",
    "![](https://wangyu-name.oss-cn-hangzhou.aliyuncs.com/superbed/2019/09/24/5d8a226e451253d1785ec030.jpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 线性回归\n",
    "\n",
    "后面我将给出一个例子引出线性回归的定义，并讲解求解线性回归的方法，最后给出线性回归的代码示例。\n",
    "\n",
    "### 简单线性回归\n",
    "\n",
    "![](https://wangyu-name.oss-cn-hangzhou.aliyuncs.com/superbed/2019/09/25/5d8b02fa451253d178802ae2.jpg)\n",
    "\n",
    "上图中给出了一个坐标系，定义横坐标为 $x$ 纵坐标为 $y$，这样以来图中的每个点都可以表示为 $(x, y)$ 的坐标形式。设 $x$ 代表温度，$y$ 代表长度，我们假想有个东西会随温度变化而膨胀或收缩。但是我们并不知道温度和长度之间具体的关系是什么，我们只是在一段时间内在不同的温度下对物体的长度进行了测量。\n",
    "\n",
    "经过测量后，就得到了一大堆 $(x,y)$ 的样本点，这些点散落在坐标系中，很明显，图中 $x$ 和 $y$ 具有某种关系，具体而言，其关系可以用 $y = ax+b$ 这种一次函数来描述。图中的点落在红色直线附近，这可能是因为测量的不准确，或者存在其他方面的误差。\n",
    "\n",
    "这里 $y = ax+b$ 就是一个简单的一元线性回归模型，求出模型参数 $a$ 和 $b$，$x$ 和 $y$ 的关系就唯一确定下来了。在这个例子中，我们就能知道温度和物品长度的大致关系了。\n",
    "\n",
    "\n",
    "### 线性回归模型定义\n",
    "\n",
    "线性回归问题的一般设定是：\n",
    "\n",
    "给定 $n$ 个样本 $(x_i,y_i)$，其中 $x_i$ 是 $m$ 维向量，对应样本的 $m$ 维特征。$x$ 是自变量，$y$ 是应变量。这里 $x_i$ 是向量，可以写为：$(x_i^{(0)}, x_i^{(1)}, ...x_i^{(m)})$，下标是样本的编号，上标为特征编号。\n",
    "\n",
    "线性回归的任务是找到一个函数 $h_w(x)$，代入具体参数 $x_i$ 使 $h_w(x_i)$ 的值尽可能接近 $y_i$，线性回归中 $h_w(x)$ 的定义如下：\n",
    "\n",
    "\n",
    "$$\n",
    "h_w(x) = w^Tx + b\n",
    "$$\n",
    "\n",
    "\n",
    "其中 $w$ 是和 $x$ 具有相同维度的向量，$w^Tx$ 是两个向量的内积。$w$ 和 $b$ 是线性回归模型的参数。\n",
    "\n",
    "之所以称之为线性模型，是因为这里 $y$ 的值实际上就是 $x$ 的各个维度加求和，最后加上一个偏置得来的。其中权重就是 $w$，偏置是 $b$。机器学习算法的就是用来寻找这个 $w$ 和 $b$。\n",
    "\n",
    "## 问题求解\n",
    "\n",
    "$x$ 和 $y$ 之间存在某种函数关系，即 $y = f(x)$，但是这个 $f$ 我们并不知道。我们能做到只能根据样本去估计 $f$。\n",
    "\n",
    "设估计出的函数为 $h$，为了使 $h$ 和尽可能 $f$ 一致，可以最小化下面的函数：\n",
    "\n",
    "$L(w,b) = E((f(x)-h(x))^2)$\n",
    "\n",
    "即 $h$ 和 $f$ 的误差的平方的期望值，再定义域上各处，$f$ 和 $h$ 的值相差越小，这里期望值越小。这里我们将 $L$ 称为损失函数。若机器学习模型找出的函数 $h$ 和实际函数 $f$ 存在差异，我们就称该模型存在风险，即带来误差的风险，这里的 L 就称为期望风险，因为它是风险的期望值。\n",
    "\n",
    "如果 $f$ 和 $h$ 的定义给出来了，那么我们在其值域上做个积分就能求出这里的期望风险，但是 $f$ 是不知道的，但是有一点可以确定，样本 $(x,y)$ 中的 $y = f(x)$，我们可以求出所有样本的损失的均值，以此来代替期望风险。\n",
    "\n",
    "于是我们只需要最小化下面的函数即可：\n",
    "\n",
    "$L(w,b) = \\frac{1}{n}\\sum_{i=1}^n (y_i - h(x_i))^2$\n",
    "\n",
    "下图中直线是估计出的 $h(x)=wx+b$，样本点距离直线的距离的平方的均值越小，我们就认为 $h$ 越贴近那个潜在的 $f$。\n",
    "\n",
    "![](https://wangyu-name.oss-cn-hangzhou.aliyuncs.com/superbed/5d8b5e88451253d178924cc1.jpg)\n",
    "\n",
    "\n",
    "有了上述定义，我们需要做的就是找出 $w$ 和 $b$ 让 $L$ 最小，这一步通常使用梯度下降法来完成。\n",
    "\n",
    "\n",
    "## 梯度下降\n",
    "\n",
    "![](https://wangyu-name.oss-cn-hangzhou.aliyuncs.com/superbed/5d8b6490451253d178946039.jpg)\n",
    "\n",
    "可以想象自己在下山，为了快速下到山脚下，我们每一步都向最陡峭的方向跨一步，然后再寻找最陡峭的方向，继续向下走，如果这个山不存在山坳（凹下去的坑），那么我们很快就能下到山脚下。\n",
    "\n",
    "梯度下降法就是这个思路，只不过下的不是山，而是函数值，最陡峭的方向值梯度值。\n",
    "\n",
    "回到上面的例子中：\n",
    "\n",
    "$$\n",
    "L(w,b) = \\frac{1}{n}\\sum_{i=1}^n (y_i - h(x_i))^2 =  \\frac{1}{n}\\sum_{i=1}^n (y_i - (wx_i+b))^2\n",
    "$$\n",
    "\n",
    "可以很轻松地对 $w$ 和 $b$ 求导数，然后带入 $x$ 和 $y$ 求出 $w$ 和 $b$ 的导数，然后更新 $w$ 和 $b$:\n",
    "\n",
    "$$\n",
    "w_i = w_i - α{∂ \\over ∂w_i}L(w,b)\n",
    "$$\n",
    "\n",
    "$$\n",
    "b = b - α{∂ \\over ∂b}L(w, b)\n",
    "$$\n",
    "\n",
    "这里的 $\\alpha$ 是每一次更新的大小，即对应你下山的时候跨出的步伐大小。\n",
    "\n",
    "梯度下降法可以分为下面几种：\n",
    "\n",
    "### 1. 批梯度下降\n",
    "\n",
    "注意到，前面求导的式子中，$L$ 可是在所有样本上计算出来的，所以求导的这一步也需要在所有样本上计算。\n",
    "\n",
    "在全部样本上，计算能使整体误差变化最大的方向，然后用这个方向来更新参数。多次计算梯度并更新参数，最终会收敛到局部极小值。\n",
    "\n",
    "但当样本量很大的时候，计算所有样本的梯度会是一个相当耗时的操作。对于样本量很大的情况，此方法不适用。\n",
    "\n",
    "因为 $L$ 包含所有样本误差之和，更新 $h$ 的参数可以让部分样本点的误差将为 0，但同时可能增加其他样本的误差。使用全部样本计算梯度，是让整体的误差变小。\n",
    "\n",
    "### 2. 随机梯度下降\n",
    "\n",
    "如果最终得到的模型，让整体误差最小，那么对单个样本误差的误差也应该很小，因此只需要不断地让某一样本的误差变小，那么整体误差就会变小。\n",
    "\n",
    "使用随机梯度下降法，每次迭代只是考虑让单个样本点的误差变小，而不管其他点。这样重复多次，实际上就可以让整体的误差变小。\n",
    "\n",
    "\n",
    "### 3. Mini-batch 梯度下降\n",
    "\n",
    "样本中可能存在噪声，使用随机梯度下降可能导致计算出的结果震荡严重，因为毕竟不是每个样本都与全局最佳模型完全吻合。批梯度下降用了全部样本求出梯度均值，得出的是全局误差梯度最大方向。而随机梯度下降，只用了单个样本，求的是单个样本的误差梯度最大方向。Mini-batch 梯度下降，采用一小部分样本，比如 100 个样本，求这一小部分样本误差梯度最大的方向。这样既能得到很快的计算速度，同时也可避免严重的震荡。\n",
    "\n",
    "在成熟的机器学习库中，计算梯度这样的操作常常是借助 GPU 来完成的，计算过程是矩阵化的操作，因此在单个样本或是一批样本上计算梯度，对 GPU 而言几乎是相同的。\n",
    "\n",
    "![](https://wangyu-name.oss-cn-hangzhou.aliyuncs.com/superbed/5d8b6a78451253d17896106f.jpg)\n",
    "\n",
    "对于只有两个参数的线性模型，采用梯度下降更新其参数，上图展现了三种方法中两个参数的变换趋势。可以看到批梯度下降的路径非常平滑，随机梯度下降的波动很剧烈，而 Mini-batch 梯度下降的波动居于两者之间。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "\n",
    "## 过拟合与欠拟合\n",
    "\n",
    "![](https://wangyu-name.oss-cn-hangzhou.aliyuncs.com/superbed/5d8b6acb451253d178962738.jpg)\n",
    "\n",
    "上图中，红色曲线是 $h$ 绿色曲线是 $f$，即 $h$ 是机器学习学到的模型，$f$ 是实际产生样本的函数（真实的模型，只有上帝知道的模型）。我们希望学到的模型最贴合实际。\n",
    "\n",
    "第一幅图中，学到的模型和真实模型差距很大，且对样本拟合的也很差，我们称当前的模型欠拟合。\n",
    "\n",
    "**拟合**： 用一个函数去逼近样本的分布，可以说机器学习就是一个拟合样本的过程。\n",
    "\n",
    "最后一幅图，学习到的模型与所有样本完全吻合，每一一丝的误差。但是机器学习的目的不在于拟合样本，而是尽可能逼近真实的模型。我们称第四幅图过拟合。\n",
    "\n",
    "可见，欠拟合与过拟合都是我们想要极力避免的。欠拟合通常是是因为模型过于简单，比如图一的模型中可能只有一个参数，即截距 $b$。而图四中的模型过于复杂，拟合能力过强。\n",
    "\n",
    "## 学习速率\n",
    "\n",
    "下面式子中的 $\\alpha$ 称为学习速率，可以看到这个值越大 $w_i$ 变化的也就越快。\n",
    "\n",
    "$$\n",
    "w_i = w_i - α{∂ \\over ∂w_i}L(w,b)\n",
    "$$\n",
    "\n",
    "正常情况下，每次像梯度的反方向走一步，误差都会变小。但是如果学习速率选择的过小，迈出的这一步就会很小，收敛速度会变慢。过大则可能越过了最佳点，导致在最优解附近震荡。\n",
    "\n",
    "![](https://wangyu-name.oss-cn-hangzhou.aliyuncs.com/superbed/2019/09/26/learning_rate.gif)\n",
    "\n",
    "合适的学习速率，可以使得误差稳步减小，且减小幅度慢慢变少，最终能较为迅速地达到最佳点。\n",
    "\n",
    "\n",
    "## 正则化\n",
    "\n",
    "![](https://wangyu-name.oss-cn-hangzhou.aliyuncs.com/superbed/5d8b6acb451253d178962738.jpg)\n",
    "\n",
    "前面提到，实际的映射关系 $y=f(x)$ 是不知道的，所以最小化期望误差，或期望风险是不可行的，于是我们最小化样本的误差。上图中，红色曲线是 $h$ 绿色曲线是 $f$，这里估计出的模型 $h(x)=wx+b$ 确实做到了最小化样本误差，它与所有样本完全吻合，甚至没有误差。但是我们知道，这不是期望得到的模型。我们的模型应该反映整体的趋势，而不是完全拟合样本。\n",
    "\n",
    "严格的线性模型往往是直线或者平面，为了得到曲线可以在模型从引入自变量的幂，构成一个多项式模型：\n",
    "\n",
    "![](https://wangyu-name.oss-cn-hangzhou.aliyuncs.com/superbed/5d8b7231451253d178984200.jpg)\n",
    "\n",
    "多项式模型能够表达一维线性模型能表达的一切，能够拟合更加复杂的关系，但自然界中的大多数关系都是比较简单的，所谓大道至简，机器学习模型也是这样，在能够大致拟合样本的前提下，我们希望模型越简单越好。\n",
    "\n",
    "观察出现过拟合的图四，其中曲线的变化非常剧烈，为了完全拟合个别点，在某些位置曲线剧烈的上升和下降。而我们希望得到的是一个平滑的曲线，比如图三那样。因为自然界倾向于简单，我们不知道的那个真实模型 $f$ 也往往不会很复杂。\n",
    "\n",
    "变化剧烈的函数存在一个特点，即在某些点斜率很大，对应到线性模型中，就是某个 $w_i$ 的值会很大，前面四幅小图是用不同幂次的多项式模型进行拟合的。下表列出了各个模型的参数 $w$ 的取值。\n",
    "\n",
    "![](https://wangyu-name.oss-cn-hangzhou.aliyuncs.com/superbed/5d8b7352451253d178989e33.jpg)\n",
    "\n",
    "为了让曲线保持平滑，让曲线避免剧烈的波动，就需要控制系数，让系数不能变的过大。为了让系数保持适当大小，做法就是添加正则化项，把系数加入到损失函数中，这样以来系数变大，损失函数的值也就变大。\n",
    "\n",
    "\n",
    "$$\n",
    "L(w,b) = \\frac{1}{n}\\sum_{i=1}^n (y_i - h(x_i))^2 + \\lambda {\\Vert w \\Vert}^2\n",
    "$$\n",
    "\n",
    "这里 $\\lambda {\\Vert w \\Vert}^2$ 就是正则化项，当模型很复杂的时候，这一项的值就会变大，损失函数的值也就会变大，加入着一些，可以保证模型复杂度和误差之间保持协调。\n",
    "\n",
    "上面式子中的 λ 能够控制正则化项对误差的影响，当 λ 很小的时候，w 的大小对整个误差构不成太大影响，这个时候正则化项就没起到什么作用。因此调整这个值的大小可以控制正则化的作用强度。\n",
    "\n",
    "\n",
    "常见的正则化策略有：\n",
    "\n",
    "- L2 正则化\n",
    "- L1 正则化\n",
    "\n",
    "### L2 正则化\n",
    "\n",
    "L2 正则化项是系数的平方和：\n",
    "\n",
    "\n",
    "![](https://wangyu-name.oss-cn-hangzhou.aliyuncs.com/superbed/5d8b75ef451253d178995a88.jpg)\n",
    "\n",
    "_从李宏毅老师的 PPT 中截的图_\n",
    "\n",
    "求导后，正则化项中还包含一个 λw：\n",
    "\n",
    "![](https://wangyu-name.oss-cn-hangzhou.aliyuncs.com/superbed/5d8b7662451253d1789981c9.jpg)\n",
    "\n",
    "因此 $w$ 每次更新都会减小一定得比例。最终，个别不重要的项的系数会变成一个很小的小数，但不为零。采用 L2 正则化得出的模型的系数都比较小。\n",
    "\n",
    "### L1 正则化\n",
    "\n",
    "L1 正则化的正则化项是系数的绝对值求和：\n",
    "\n",
    "![](https://wangyu-name.oss-cn-hangzhou.aliyuncs.com/superbed/5d8b76e2451253d17899ac72.jpg)\n",
    "\n",
    "每次更新，正则化想会导致 $w$ 被减掉或者加上 $\\eta \\lambda$（取决于 w 的符号）。\n",
    "\n",
    "### 关于稀疏解\n",
    "\n",
    "模型的输入 $x$ 可能包含很多特征，但是究竟那些特征才是有用的特征呢？我们希望模型的参数给出说明，当 $w_i$ 较小的时候，我们认为 x 的第 $i$ 维特征不重要。如果 $w_i$ 干脆为 0，那么可以说第 $i$ 维特征就完全没用，可以完全去除这一特征，精简模型。\n",
    "\n",
    "提到正则化，人们必然会讨论 L1 正则更可能得到稀疏解，即 w 的某些维度会取值为 0，下面来进行解释：\n",
    "\n",
    "**图形化解释**\n",
    "\n",
    "![](https://wangyu-name.oss-cn-hangzhou.aliyuncs.com/superbed/5d8b79f8451253d1789a8d55.jpg)\n",
    "\n",
    "假设参数 $w$ 是两维的，L2 正则项的等值线是以原点为中心的一个个圈圈，上图中彩色圈圈可以视为样本误差的等值线。随着 $w$ 的变化，这两项的取值的会变化的。整体 $L$ 的取值是两个等值线在某个交点的值求和，L2 正则化中等值线的交点往往落在某个象限中。\n",
    "\n",
    "\n",
    "![](https://wangyu-name.oss-cn-hangzhou.aliyuncs.com/superbed/5d8b7ad3451253d1789ace77.jpg)\n",
    "\n",
    "L1 正则化项的等值线是一个个矩形，和误差项等值线交点更有可能落在坐标轴上。\n",
    "\n",
    "**数学解释**\n",
    "\n",
    "**TODO**\n",
    "\n",
    "## 数据归一化\n",
    "\n",
    "各变量的数量级不同，有的超过几千，有的绝对值则不大于 1。这时 Cost Function 的等高图看起来像是下图中左边的样子：\n",
    "\n",
    "![](https://wangyu-name.oss-cn-hangzhou.aliyuncs.com/superbed/2019/09/26/5d8cb7ae451253d178ee38fe.jpg)\n",
    "\n",
    "梯度下降要进行很多次迭代，因此需要对变量进行归一化处理，让所有变量都落在相近的范围内，可以加快找到极小值点的速度。\n",
    "\n",
    "具体的操作可以是，对所有样本的各个属性，找到这个属性的均值，以及标准差。对各个属性减去均值并除以标准差。这样以来样本的各个维度的均值为 0，方差为 1。\n",
    "\n",
    "![](https://wangyu-name.oss-cn-hangzhou.aliyuncs.com/superbed/2019/09/26/5d8cb76e451253d178ee276b.jpg)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 线性回归实践\n",
    "\n",
    "### 一元线性回归"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这里我人为设定真实的模型为 $y = 4x+2$，另外加上均值为 0，标准差为 4 的高斯噪声，然后使用此模型生成一些数据，使用线性回归模型来进行拟合，看看拟合效果。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.collections.PathCollection at 0x7fbfc6caa940>"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAD8CAYAAABn919SAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAIABJREFUeJzt3X+MndV5J/DvM+NLfIekHrNMsnCNY3eD7IZ4sZdZyq6lVWxaSJcAlhNCEhq5KyT/s90NNOvNUK3WpurKU3k3oVKjVFaTLVVZMAl0cEJaJ8KOqqKFzUzGLnHBCgsBfGFjt3hoYk/wnZln/7j3Hd957znve977/n7n+5Gimblzf5w7Js977nOe8xxRVRARUfkN5D0AIiJKBgM6EVFFMKATEVUEAzoRUUUwoBMRVQQDOhFRRTCgExFVBAM6EVFFMKATEVXEiixf7Morr9R169Zl+ZJERKU3NTX196o6Ena/TAP6unXrMDk5meVLEhGVnoi85nI/plyIiCqCAZ2IqCIY0ImIKoIBnYioIhjQiYgqwqnKRUR+AuBnAOYBzKnqqIhcAeAQgHUAfgLgU6p6Lp1hEhEVw8R0EweOnMKbM7O4eriOPbduwI4tjbyHBSDaDH2bqm5W1dHOz2MAnlHVawE80/mZiKiyJqabeODJF9CcmYUCaM7M4oEnX8DEdDPvoQGIl3K5E8DDne8fBrAj/nCIiIrrwJFTmG3NL7lttjWPA0dO5TSipVwDugL4rohMicjuzm0fUNW3AKDz9f1pDJCIqCjenJmNdHvWXHeKblXVN0Xk/QC+JyIvub5A5wKwGwDWrl3bxxCJiIrh6uE6mobgffVwPYfR9HKaoavqm52vZwD8BYAbAfxURK4CgM7XM5bHHlTVUVUdHRkJbUVARFRYe27dgHptcMlt9dog9ty6IacRLRUa0EXkchF5n/c9gFsA/AjAYQC7OnfbBeCptAZJRFQEO7Y0sH/nJjSG6xAAjeE69u/cVJgqF5eUywcA/IWIePf/X6r6VyLyAwCPi8i9AF4HcFd6wyQiKoYdWxqFCeB+oQFdVV8BcL3h9n8AcHMagyIioui4U5SIqCIY0ImIKiLTAy6IiKqiiC0AGNCJiCLyWgB4u0a9FgAAcg3qTLkQEUVU1BYADOhERBEVtQUAAzoRUUS2rf55twBgQCciiqioLQC4KEpEFJG38MkqFyKiCihiCwAGdCKiGIpUj86ATkTUp6LVozOgE9GykMZMOqgenQGdiCgFac2ki1aPzrJFIqq8tHZ2Fq0enQGdiCovrZl00erRmXIhokJwyXH3mwdP63DnotWjM6ATUSLiLDq65Ljj5MH33LphyWOB5GbSRapHZ0AnotjiLjq6VIuE5cGDLiZFm0mnhQGdiGKLW77nkuO23ce7eIRdTIo0k04LF0WJKLagYLt1/CjWjz2NreNHMTHdNN7PpVrEdp9BkUL2Js8DZ+hEFJtt0VGAxduD0jAuOW7bffzB3JNkLbi3PtCcmcWgCOZV0Shg2oYzdCKKzVS+JwDUdz/bzHnHlgb279yExnAdAqAxXMf+nZt6Uiam+zRSrgX31ge8C9O8tt+Vd4Hq/tQxMd10+kSSFs7QiSg206KjacYO2GfOLjlu233SqmABzOsDnu51giL0dWFAJ6JE+IPt1vGjqdR+m14XAPYdPomZ2RYAYGUtueRDWOrG+71tYXjf4ZOZBXSmXIgoFVnvonx3bmHx+3MXWj3pkH6FXYC839sC/8xsK7PUCwM6EaXCJS+elLR6tQDmC5On+wIVFPizqrhxTrmIyCCASQBNVf24iKwH8BiAKwD8EMDnVPViOsMkojLKqvY7za6H3esDQVUue27dgPsOHU9tHC6i5NA/D+BFAL/U+fkPAHxZVR8TkT8GcC+AryY8PiKiUHF6tbi0LHBdsH3wWydx7kKrr3EkwSmgi8gaALcB+G8AfkdEBMB2AJ/t3OVhAPvAgE5EOQirY+8O2qvqNYgAMxdaWFWv4fzFObTml5YiAtEqU7znP3eh1VOumWX3RVH1V4oa7iTyTQD7AbwPwH8C8FsAnlPVD3V+fw2Av1TVjwQ9z+joqE5OTsYdMxFRz8x628YRHHvp7JKfv33ircXKlygaw3U8O7bd+Dr+Gby/XBG4VIM/3HXxiNM/RkSmVHU07H6hM3QR+TiAM6o6JSIf7Rqvn/HKICK7AewGgLVr14a9HBFRKFPN9xNTzcVF14npJvZ84wRaC+ETVhMv5+1SW25akPWC+btzC5nWpbukXLYCuENE/i2AlWjn0B8CMCwiK1R1DsAaAG+aHqyqBwEcBNoz9ERGTUSlFvd8z7Cqli88fmJxR2c/vJy3S9OxoHJFv7TPGw0tW1TVB1R1jaquA/BpAEdV9R4AxwB8snO3XQCeSmWERFQp3VvpFeYt9GHCOi/GCebdOW+X6pmoC55pVrzEqUP/ItoLpC8D+CcAvpbMkIioypKoGY/SeTFMbUCweqi25PEHjpzCxHTTqQukbQOV95yuY09CpICuqt9X1Y93vn9FVW9U1Q+p6l2q+m46QySiKkmiZtwWRF1m5kO1Aaweqi1udjpw1/XYe/t1Sx7fnJnFnm+cwLnzvWHNX7Vi20DlPWfQY5PGXi5ElKkkzve0nUDkbf4xWT1Uw97brzPmr7eOH+2Z2bcWtGdR1fYcQXXqWZ6SxIBORJlK6nzPKJ0Xw1oOuH46GLpsRaSAnPUpSQzoRJSpNM/37Pe5g9r9dstqC3+/nDYWJYUbi4iKLW45YVmZNgeZdG84ylJiG4uIaHkowgENNmldaPwtAVbWBoy9WID2bsqstvD3i+1ziQhAui1o40iibt3leWdmW/hFa8F6f0X+F7YwDOhEBACRj4zLSloXGtvzDoqpswmsZ5cWCQM6EWFiumls0ARk1/rVJq1e57bHm2rZs+yYGAcDOhHhwJFTxu56AmDbxpFcT7K3XVAUiDUe1wvV6qFaaictJY0BnYiss1UF8MRUM3b+emK62fdFIegIuDj59KDn7Ra19jxPrHIhImsdtqk3Snf+2qXyJG71jP8IOL9+Oxj6a9ZtBdx5ryFEwRk6EUXujeIF5e6Z+/2HjmOdYQZuW3zcd/ik8/h2bGng2bHt1jx/1KDrfWK4v3MG6Jfv3mxd9Mx7DSEKBnQisjaYsgU508zdC/3+NEhQv/CgVIkpTePS/TCMrQxy28aRzJtpJY0BnYgAXJoFvzp+G54d244dWxp9dzXsTssEBVtb6WGaQdf2ieHYS2eNF7Wy5M8B5tCJKEA/XQ09zZlZrB97Gqvq5r7ggH32HhZ04+waDSqDTLKZVh5tFBjQiSiQa1dDE28Hpo1t9p5m0E2ifW+YvNooMOVClLM4JX156c65A+ZT4/389wlKlSSRK7expZGSzJXn1UaBM3SiHBW5IVaY7plyd3rBll1XtPPSLikI157p/aQ10mzf60lrd2sYBnSiHLmcKl+GlrbdwX3r+FFjSiNK61mXoBvnYpj2wRNZpHVMGNCJchQ0k5uYbmLf4ZNLctB5z+BdLi5RTySyPWdY0HW5GOYlqVOZomJAJ8qRbSa3ql6zLjrmFbRcZ8RRUhq255x87W0ce+ls4OPzSmu4yCKtY8KATpQj20xOBIEVJFkHrYnpJr7w+Ime+nNvx2eUQ5O72WbZjzz3es9GJe95PXmlNVxlfZ4owCoXolzZdmjOWE7N8WQRtLzqm3VjT+P+Q8etm4nCdnwGCWoK1s1UIZJFtUrZcIZOlDPTTC5o404WQcufCgk7ebjfFJDr4cxAe6a+dfzokhy799pFXjDOEgM6UQGZUjFAuzf33tuv6ytoRamWMaVCgvSbAjK9T4H9AuJPv+SR1igyBnSignrPioHFQBcnkAPBC5pA7yw3aoDuNwVkmmVv2ziCJ6aa1gtKUSpZiogBnSghSdWL+4MvgMDDi10EtbB9d26hJ9APD9VwLiSP74mbAjLNskc/eEVg2qkIlSxFFLooKiIrReT/iMgJETkpIg92bl8vIs+LyI9F5JCIXJb+cImKKamT6b1qkqS3jQe1sDW91i9a8z1b9b2fh+s1rB6qpdqR0Ov8WIUe5VlymaG/C2C7qv5cRGoA/kZE/hLA7wD4sqo+JiJ/DOBeAF9NcaxEhZXEJhfvomCrJgmalXZ/OlhVr0EEmLnQWvykEGXxsT32pZ8IBMA9N63F7+/Y5PwcSchrg05ZhQZ0VVUAP+/8WOv8TwFsB/DZzu0PA9gHBnRappLY5BK2EHn1cN2Y1gGWdj407Sz9xA2NwLx0GAVw7KWzfT02DlayROOUQxeRQQBTAD4E4CsA/i+AGVWd69zlNAD+hWnZSmKTS1Dwr9cGsW3jiHFhc2VtIDBQ+/uIR5mpu44vTaxkcee0sUhV51V1M4A1AG4E8Cumu5keKyK7RWRSRCbPns3+Ck+UhSQ2udiC/6AI9u/chGMvnTWmdVwWL70+4kHncnrEcgfmrYsv0k5RVZ0B8H0ANwEYFhFvhr8GwJuWxxxU1VFVHR0ZGYkzVqLCsu34jDKztF0U/senrseOLY1YM+TuYBwUmOu1Qdzzq2tz24FZxt7wRRKachGREQAtVZ0RkTqAXwPwBwCOAfgkgMcA7ALwVJoDJXKVV7vZuKmBsHxx0MJm0GYcfzB22bTklQ1m+Tcsc2/4ohANOexVRP452oueg2jP6B9X1d8TkV9GO5hfAWAawG+q6rtBzzU6OqqTk5OJDJzIxFTDXa8Nlu6wX5OJ6SbuO3Tc+nvv8AhTlYv/vad50ev3uZPoo15VIjKlqqNh93OpcvlbAFsMt7+Cdj6dqDCilA+W4eCIbju2NPDgt04ac+aDIpHGn9ZCo8ss2/Z3L3I73LJgt0WqFNegkNRGoKztvf26nvw2AMyrFmL8YWdpBv3d0zxHdLng1n+qFNfywTRPuwnb5OM/Ri3KpwTvd7be5F94/ATuP3Q8t08cYRdU29/9vkPHMVyvoTYoaM1fel/cRBQNAzpVimnBrzYguHBxDuvHng5tPhX3470/5WA7Pg5A38fL7djSwP2WXLoX5KOc/JOksAtq0N93ZraF2oBg9VAtMP9PdgzoVCn+SpFV9RrOX5xbzDuHNZ+K+/E+bLfnbGseDzz5twAk1vFyLlv5XU/+SVLYVv2wcbcWFEOXrcD0f70llfFVHXPoVDneBppXx2/D5e9ZseQjPNAOdKpIpdbaZYY/21oIDPouz2GqWTdxOfknSWH1+C7j5iJo/zhDp0qzBYd3Zlv48t2bjX1Rto4f7TtFEbUJlu05wvg/iQyIWJt6+Xkn/6SVhgmqoOket+3vxEXQ/jGgU6UF5XT9gcd1Y0vQQqZt046rKJ8Susdvqr+3bTaSznsD8tm8443btmeAi6D9Y8qFKi1Kj5WwkjsgvNzRn3Kw9UUxWT1U63sDlCnVcc9NvVv4TUE+7TSMTRLtEmgpztCp0qK0X3WpfHEpdwybOZs8dPfm2IEs6OQf770X7QQgdlJMFgM6VZ5r0HCpYY9a7uiSM2500j9p8L932/Z65q2rgSkXog6X9Ew/uxm9qpuH7t4c+PxZdBpMos0vFRcDOlGHS043bkB8z4pL/5frzpln1YqAeetqY8qFqEtYeqbfI9FMufRfdJ3bmWYrAr8k89Zla3BWdQzoRBH1ExDDAnYZOw2yf3nxMKATpcA/cw2rLkniTNKsZfmpgtwwoFOpeYGzOTOLwc5uyUbOBzaYZq62TT5ewA7rgZLU2JJUxk8VVceATqXlD5z+ToNA/I/+/aQVTDNXRe+mnu6A3U9uPu+URxk/VVQdAzqVVlBnw7CP/q4z237SCrYZquLSMXGm14yamw/b2Zr2zL3fTxWUHgZ0KpQoKYSwj/a230eZ2dqew2twZRqfbeaa9NmYQWPLYubeb8UPpYcBnXLjD97bNo7giammcyAK62xo++gfZdYd9Bq28WU1c7WNbVB6e62XoQSS4uPGIsqFaSPNI8+9Htocq1tQb+2gABplMS+sf7dpfFlt3rFtcrK10eViZfVxhk65ePBbJ40LhyaufVK6Fx1X1uxzlSiLeS69WJozs0uOt/NmrWnPXG0pD9tYuVhZfQzolLmJ6abx+DebsD4ppt7a5y60etIh3SWOQRUnttewNbZC57ny7C3ux8XK5YkBnZwkWe8c1Hs7SqDtHpPpxJ7uvLE/4HeXEdrq1sNy/CZF2FjDxcrliwGdQiVd7xyUy73nprVOp9TbatBtr2WrDbdVnpje8xNTTXzihsbi+KKmiLLExcrliQGdQiW9xduWwx6u1/D7Ozb1PSbbawHRdzXa3vOxl84uXgDYW5yKhlUuFCrpLd626ox9d1wXe0z+5/TSNVH7mLu8Z/YWp6LhDJ1C9bPFOyjnnkSON6gGe0HV6fDmoODr8p6zzFWzTS25ELXkHhfvIHINgD8D8E8BLAA4qKp/KCJXADgEYB2AnwD4lKqeC3qu0dFRnZycTGDYlCXb6ey22uqo989iTN5jXINiFu/BVdyx8GJQfiIypaqjofdzCOhXAbhKVX8oIu8DMAVgB4DfAvC2qo6LyBiA1ar6xaDnYkAvryhBwZZbTnrre9qBqvv5V9VrEAFmLrQyD4px/p5FujBR/1wDemjKRVXfAvBW5/uficiLABoA7gTw0c7dHgbwfQCBAZ3KK0rVRNSce7+BOe1KDluNe9b15nHWMNizfHmJtCgqIusAbAHwPIAPdIK9F/Tfb3nMbhGZFJHJs2fPxhstlUKUBch+ztLM4jDlbmFdDdPWz8HUHvYsX16cA7qIvBfAEwDuU9V/dH2cqh5U1VFVHR0ZGelnjFQyUao/ogZL1wuAF/TXjT2Nf/bAd7AuRvDPOyjGqaaJczGg8nGqchGRGtrB/BFVfbJz809F5CpVfauTZz+T1iCpXKJUf7gEy4npJvYdPomZWXO7AH8KIerBF2Epn7wPcohTTcOe5ctLaEAXEQHwNQAvquqXun51GMAuAOOdr0+lMkIqJdf8dliwnJhuYs83TqC1ELx4330BiHLwhUt+3BYUt20cwdbxo5lUj/S7XsA2AMuLywx9K4DPAXhBRI53bvtdtAP54yJyL4DXAdyVzhCpysJmkAeOnAoN5sDS2XKUgy9cFg1NQTFq7/Y8sQ3A8uFS5fI3aPcxMrk52eFQEopcd2wa2/6dm6zjjbojFIh28IVrftwfFLeOH2X1CBUOd4pWTN4ldkFsY9u/c5O1njosOJs6JZpm/R7X4B+WH897oZTIhL1cSiiobC/tErs4JYP9jG3PrRtQG+j9gFgbFDx092Y8O7a950LVfWIQ0G4HAJhPDrJVkHj5cdv7ZPUIFRFn6CUTNgNPc+YYd/bfz9i85+2uclk9VMPe268LfE3XvLH/RCLvPM5Hnnt9sT1ulIVSVo9QnhjQSyZsEc+WQhgQ6TkmLenXDtNveiOLHaHA0lN+/MuwLgulRVqroOWJAT0hWSxETkw3rflkb5Zryx+H1WK7iDv7L/Ks1qW/ethCKVHeGNATkMVCpPcaNt4s1z9zDDuaLew1uy9Sw0M141mg3TXjQRe1oFmt6bg3l5OLkuJyUWJ+nIqOAT0BWTRACppB+me53TPH9WNPGx8TFsBMF6nagKA2KGjNX7pAeK/telEzzWpNj/3z515f/H33c3l/i6QDfVg1TVE+SRAFYZVLAmzBsTkzm1jzqKAAvH9n+9g2U1VGv9UYpgtIa0Fx+WUr0BiuQ7C0aiROdY1LumO2NY99h09GbuTlylTt4tXWmKpjiIqIM/QEBM3uoqZfbGkLW7rDK82zzY77zVvbLiDvzLZwfO8tzvd3SWW45uBNvVyS+iTERU6qAgb0BARtZAGi5axNgXnytbfx81/M9dy/NijYc+uGwNmxt2EnaqCKWpFiu/+qei2030lYuiNMUpt5uMhJZceUSwL8G1lM4hxG8Ojzbxj7mVx+2Qqn2vMdWxp4dmw7Xh2/zbgRxyRqy1bT/WsDgvMX50JTJKbHRsHFSqI2BvSEeEHTFtTjHEbgr1LxvNNJQaSxa7H7IuXPl7ve/70rVyxZQAXMeXXTY10VZbEy60M3iEyYcklYnFrroJPsTUH96uE6JqabOP9ubzomiUAXNQXhv3+UChtT8yvT32K4XsPl71lRqDx3kfvn0PLCgJ6wNA4j+MQNjSWtWr3bt20cMebuXbbG+6WxMSrOwRC2v8W+O6K9ryzw3E4qCgb0FKRxGMHoB6/oud1W7jfUya27SmuGaQrKAmDbxvCjCMtUdcLOi1QUopb8bBpGR0d1cnIys9frV5rb+JN87vVjT/f0HPGY2sra2NIbjeG6ta2tq/8y8cKSRldAe6adV113Gv+2af79iABARKZUdTTsfpyh+6SZD4363P2edRl13EnPMLvHPSAS2ugqK1l+EinKYi0tLwzoPmnmQ6M8d79nXfqfe9/hk6G9U0z9XoDwXLfpggMs3eRkq9DJIx2R1r9tmdJDVG0M6D5p5kOjPHfUsy5tM/WZ2dbiDsvujUrdi6ymoBs2w7RdcFbWBkK38QP51I6n+W/LTUlUBAzoPnEqM+I8t3+2G9Ym1+MFElse18/bqGQK4oMiWFB1mmHaLjguwTyvdESa/7ZERcCNRT5Rd0gm8dxe+WH3jkrbqdy24BNlt6UtDbKg6rSbNKgvu82giNMGpTSl+W9LVAScofu45kNdqyX89/vEDY2ePt+m2a6iXeLnrw6xBR9/+sX/2G5BG5XChPVlh2XcRehWyFw3VV3hyxbjlpmlUabmzx8D5qDler+g8sPhei3SWZqAvYzOe33bRiVT0PX//S5cnDN2ffTzgnqU8kkiMqtE2WLcMrOwx/cb7F2rJVzvZ8vtDtdreHduYfHncxdaTu8/rHe6baOSS6WNKy+Ysw6bKDuFDuhxy8zCDl3o92LhWi3hej/bjsp++3/bLhCN4fqSChnbc3gXujgtbQHulCTKWqEXReOWmQU93uWEHVsHPdfuhq7387ffDcp/e+MPEmfxz5uVxw3mQO/7ZEdConQVOqDHbQsb9Piwi0V3YPP38nYNmFECa3f73bBVjbD3H7X1bTeX4+CG67XQFrf+9xn09ySiZIQGdBH5uoicEZEfdd12hYh8T0R+3Pm6Oo3BxS0zC3p82MUiLN3TPaMeFFn8XXeA6iewhs2+Xd9/P4dauL7+vjuuw7Nj262llQB63mecM0eJyI1LDv1PAfwRgD/rum0MwDOqOi4iY52fv5j04OKWmYU9Pqj/hsspQP7nMOXho+4gDNpUlETFSNhCsO3sUtPru+TqPexISJS+0ICuqn8tIut8N98J4KOd7x8G8H2kENCB+FuqbY8PC/YuuwrT6A1ia/SURB23S9WP7ezSA5+8vuf1ozSl4i5NovQ51aF3Avq3VfUjnZ9nVHW46/fnVNWYdhGR3QB2A8DatWtveO211xIYdvpcasiTal9reu1+Ni2Fvd6W3/uucfbtnQJk+2QwXK/h+N5bYo81rQsVUdUVpg5dVQ8COAi0Nxal/XpJcUn3JNW+1vTaYY/ppxWvLZXS3cDL5J2A37l+guIuTaL09RvQfyoiV6nqWyJyFYAzSQ6qKMKClUv72rT6fkdN98RZfEwqLcKOhETp6rds8TCAXZ3vdwF4KpnhlIu/2sUkrUW/qIuM/Y6DzauIysOlbPFRAP8bwAYROS0i9wIYB/DrIvJjAL/e+XlZ6q4fN0lr0S9qjb7tdgmoPcyzMyIRRRca0FX1M6p6larWVHWNqn5NVf9BVW9W1Ws7X9/OYrBFlnVr1qivZ7p/bUBQX9H7n0C9NoiH7t4cqX6diPJX6F4uZZL1ol/U1/Pff1W9hvMX53ChtbDkfq4dHYmoeArfPreI0mjJmzWeVE9UHoUpW6yatE6Ozxp3bhJVT6GbcxVRVXqSxG18RkTFw4AeUVVmtjxfk6h6mHKJKI2eJHnk5Llzk6h6GNAjitKQykWeOXnu3CSqltIE9Ciz2DRnvEnPbNPo2EhEy1MpAnqUWWwWM94kZ7ZVyckTUf5KsSgapbKkbFUorDYhoqSUIqBHmcX2M+PN8/BiVpsQUVJKkXKxVZasqtcALM2ZD4hg3rD71Zvx+vPr2zaO4ImpZm4bhVhtQkRJKcXW/4npJvZ84wRaC0vHWhsU3P0vr1kSkE28k3GA3nNEBTCeOmTaAl+FLf9EVD6uW/9LkXLZsaWB967s/TDRmlc8+vwbxmA+KALB0hawpvy67XLmT9F4i63NmVkoLs3ks0zPEBEFKUXKBQBmLMenmdIrALCgilfHb1tyW5TKEX+KxpTyYXkhERVJaQK6LY8+GJIzd3kOf9pF0J6Bb37wuzh/cQ6teXtaiuWFRFQUpUi5AOZqEEF7hu4/dMdWJWKrKLnnprWLJw51B/eZ2VZgMAfs5YVhlTN5VtYQUTWVZobeXQ3SnJldEngVlwJxI2CxMqyixNYj3KY2KMYLR9jmpqq04CWiYilFlYtfGoczTEw3cd+h45EeM1yv4fjeWyKPj4dLEFEUlapy8bPNovvNZ3sz5qjemTUv1IZtbuJ2fyJKQ2lSLp6J6aa1dnxVvYat40cj14mbyhk9tYH2ouuC4QWDtu0HtdhNowUvEVHpZugHjpyy1o6fvzjXV5140Mz4wF3X40uf2hxpe37Ydn5u9yeiNJRuhh4UfP0VKf46cdtOT9uMuTFcXzLDd90lGrb4yu3+RJSG0i2KRq1EAdqLlxfn5nGhtbDk9qCWAN7vGGSJKG+VXRS1pStWD9Wsj5mZbfUEc2DpDH7/zk1oDNd72gUQEZVF6VIutnQF0DvLduGlcPI4jo3NvogoSaUL6EBw8LX1XbHJq7KEm4uIKGmxUi4i8jEROSUiL4vIWFKD6teOLQ08O7Z9cRt/mKwqS0zb/Mt2shIRFV/fAV1EBgF8BcBvAPgwgM+IyIeTGlgcpjy73+qhWiZ5clvb3aQ3RxERxUm53AjgZVV9BQBE5DEAdwL4uyQGFoc/z76qXoNIuwVv1rlq20w8SpdIIiIXcQJ6A8AbXT+fBvCr8YaTnDwWOU1sM+55VdRrgz2lktxcRET9ipND93etBQw78kVkt4hMisjk2bNnY7xcOdlm3F5pJEsliSgpcWbopwFc0/XzGgBv+u+kqgduZkrzAAAEvUlEQVQBHATaG4tivF6isioZ3HPrBuOmJe/1GMCJKClxAvoPAFwrIusBNAF8GsBnExlVyrIsGeQ2fyLKSt8BXVXnROS3ARwBMAjg66p6MrGRpSioZDCNQMuZOBFlIdbGIlX9DoDvJDSWzLAfORFVUel6uSQhqI85EVFZLcuAzn7kRFRFpezlEhcXKomoikof0PstP+RCJRFVTakDOjsWEhFdUuocOjsWEhFdUuqAzvJDIqJLSh3QWX5IRHRJqQN62coPTQddEBElpdSLomUqP+QCLhGlrdQBHShP+WHW/WOIaPkpdcqlTLiAS0RpY0DPCBdwiShtDOgZKdsCLhGVT+lz6GVRpgVcIionBvQMlWUBl4jKiSkXIqKKYEAnIqoIBnQioopgQCciqggGdCKiihBVze7FRM4CeK3Ph18J4O8THE4Z8D0vD8vtPS+39wvEf88fVNWRsDtlGtDjEJFJVR3NexxZ4nteHpbbe15u7xfI7j0z5UJEVBEM6EREFVGmgH4w7wHkgO95eVhu73m5vV8go/dcmhw6EREFK9MMnYiIApQioIvIx0TklIi8LCJjeY8nbSJyjYgcE5EXReSkiHw+7zFlQUQGRWRaRL6d91iyICLDIvJNEXmp82/9r/IeU9pE5P7Of9M/EpFHRWRl3mNKmoh8XUTOiMiPum67QkS+JyI/7nxdncZrFz6gi8gggK8A+A0AHwbwGRH5cL6jSt0cgC+o6q8AuAnAv18G7xkAPg/gxbwHkaE/BPBXqroRwPWo+HsXkQaA/whgVFU/AmAQwKfzHVUq/hTAx3y3jQF4RlWvBfBM5+fEFT6gA7gRwMuq+oqqXgTwGIA7cx5TqlT1LVX9Yef7n6H9f/RK990VkTUAbgPwJ3mPJQsi8ksA/g2ArwGAql5U1Zl8R5WJFQDqIrICwBCAN3MeT+JU9a8BvO27+U4AD3e+fxjAjjReuwwBvQHgja6fT6Piwa2biKwDsAXA8/mOJHUPAfjPABbyHkhGfhnAWQD/s5Nm+hMRuTzvQaVJVZsA/juA1wG8BeAdVf1uvqPKzAdU9S2gPWED8P40XqQMAV0Mty2L0hwReS+AJwDcp6r/mPd40iIiHwdwRlWn8h5LhlYA+BcAvqqqWwCcR0ofw4uikze+E8B6AFcDuFxEfjPfUVVLGQL6aQDXdP28BhX8mOYnIjW0g/kjqvpk3uNJ2VYAd4jIT9BOqW0XkT/Pd0ipOw3gtKp6n7y+iXaAr7JfA/Cqqp5V1RaAJwH865zHlJWfishVAND5eiaNFylDQP8BgGtFZL2IXIb2IsrhnMeUKhERtHOrL6rql/IeT9pU9QFVXaOq69D+9z2qqpWeuanq/wPwhoh4p4TfDODvchxSFl4HcJOIDHX+G78ZFV8I7nIYwK7O97sAPJXGixT+TFFVnROR3wZwBO1V8a+r6smch5W2rQA+B+AFETneue13VfU7OY6JkvcfADzSmai8AuDf5TyeVKnq8yLyTQA/RLuSaxoV3DUqIo8C+CiAK0XkNIC9AMYBPC4i96J9YbsrldfmTlEiomooQ8qFiIgcMKATEVUEAzoRUUUwoBMRVQQDOhFRRTCgExFVBAM6EVFFMKATEVXE/wen1P8dtCUfOgAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "x = np.linspace(0, 10, 200)\n",
    "y = 4 * x + 2 + np.random.normal(0, 4, 200)\n",
    "\n",
    "x = x.reshape(-1, 1)\n",
    "plt.scatter(x[:,0], y)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "我不在自己从零开始实现 LinearRegression 而是直接使用 sklearn 这个优秀的机器学习库。我推荐新手直接使用现成的库，去感受机器学习算法，而不要一开始就盯着细节。\n",
    "\n",
    "使用 sklearn 以后，如下代码就能实现线性回归："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None,\n",
       "         normalize=False)"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from sklearn.linear_model import LinearRegression\n",
    "\n",
    "lr = LinearRegression()\n",
    "lr.fit(x, y)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "新构造出一些 x ，用上面训练好的模型来预测："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "x_new = np.random.random((50,1)) * 10\n",
    "y_pred = lr.predict(x_new)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.collections.PathCollection at 0x7fbf946bb4a8>"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAD8CAYAAABn919SAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAIABJREFUeJzt3XuUlNWZ7/Hv090FdJNAgzQqjS3kTJaOxKNETDIH4xE0kcQbk4uJIcIYMz1HjudoEkEYT0bMDLHFTKIzRBKS0chMT6ITDSHqaC7gaFgTRxAcxehKRgUpMLbGFkO30pd9/qiq7rq8b9Vbl7du/fusldXd1XXZ1cTn3fXsZz/bnHOIiEjta6j0AEREpDQU0EVE6oQCuohInVBAFxGpEwroIiJ1QgFdRKROKKCLiNQJBXQRkTqhgC4iUieayvli06ZNc7NmzSrnS4qI1LydO3e+6pxry3W/sgb0WbNmsWPHjnK+pIhIzTOzvUHup5SLiEidUEAXEakTCugiInVCAV1EpE4ooIuI1IlAVS5m9iLwJjAEDDrn5pnZVOAuYBbwInCxc+71cIYpIlIdNu+KcvNDz3Ggt58Zrc2sOPcEFs9tr/SwgPxm6Aucc6c65+bFf14F/MI5927gF/GfRUTq1uZdUVbf+xTR3n4cEO3tZ/W9T7F5V7TSQwOKS7lcBNwZ//5OYHHxwxERqV43P/Qc/QNDKbf1Dwxx80PPVWhEqYIGdAf81Mx2mlln/LajnXMHAeJfp4cxQBGRanGgtz+v28st6E7R+c65A2Y2HfiZmT0b9AXiF4BOgI6OjgKGKCJSHWa0NhP1CN4zWpsrMJpMgWbozrkD8a+vAD8C3gf8zsyOBYh/fcXnsRudc/Occ/Pa2nK2IhARqVorzj2B5khjym3NkUZWnHtChUaUKmdAN7OJZvbOxPfAh4GngS3AsvjdlgE/DmuQIiLVYPHcdm782Mm0tzZjQHtrMzd+7OSqqXIJknI5GviRmSXu/8/OuQfN7HHgbjO7HNgHfDK8YYqIVIfFc9urJoCnyxnQnXPPA6d43P4acHYYgxIRkfxpp6iISJ1QQBcRqRNlPeBCRKReVGMLAAV0EZE8JVoAJHaNJloAABUN6kq5iIjkqVpbACigi4jkqVpbACigi4jkyW+rf6VbACigi4jkqVpbAGhRVEQkT4mFT1W5iIjUgWpsAaCALiJShGqqR1dAFxEpULXVoyugi8iYEMZMOls9ugK6iEgIwppJV1s9usoWRaTuhbWzs9rq0RXQRaTuhTWTrrZ6dKVcRKQqBMlxF5oHD+tw52qrR1dAF5GSKGbRMUiOu5g8+IpzT0h5LJRuJl1N9egK6CJStGIXHYNUi+TKg2e7mFTbTDosCugiUrRiy/eC5Lj97pO4eOS6mFTTTDosWhQVkaJlC7bzu7Yye9X9zO/ayuZdUc/7BakW8btPo1lV9iavBM3QRaRofouOBiO3Z0vDBMlx+90nPZgnlLIWPLE+EO3tp9GMIedor8K0jWboIlI0r/I9A1za/fxmzovntnPjx06mvbUZA9pbm7nxYydnpEy87tMeci14Yn0gcWEacrF3lbhAJX/q2LwrGugTSVg0QxeRonktOnrN2MF/5hwkx+13n7AqWMB7fSAheZ2gGvq6KKCLSEmkB9v5XVtDqf32el2ANVv20Ns/AMCESOmSD7lSN4nf+y0Mr9myp2wBXSkXEQlFuXdRvj04PPL9630DGemQQuW6ACV+7xf4e/sHypZ6UUAXkVAEyYuXSli9WiB2YfrEc//Gzlsv4YWbzueFm87nib+7hAv3bEu5QGUL/OWquAmccjGzRmAHEHXOnW9ms4EfAFOBJ4BLnXNHwhmmiNSictV+h9n1cPEzD3PBfbfQODgwctvU/jf52r/eypMfmMXpcxcBscB/9V27QxtHEPnM0K8Cfp30803AN5xz7wZeBy4v5cBERIIqpuuhb2VKdzfMmgWf/WxKME8YNzTI6d/52sjPi+e2M6UlUvA4SiHQDN3MZgLnAWuBL5qZAQuBz8TvciewBtgQwhhFRLLKVcee3GdmcnMEM+jtG2Byc4TDRwYZGEotRWx/4Eec/tVroa8v+wvv25fy/K/3DWSUa5az+2LQlMstwErgnfGfjwJ6nXOD8Z/3A9VTXS8idS+9GdjHT2tn27M9Iz8vOLGNNVv2ZKRBEpUw6d8n9A8Mcdz6v84dzAE6OjLKFR2jNfit8YvHF+7azc0PPRf6RqScAd3Mzgdecc7tNLOzEjd73DV9D0Hi8Z1AJ0BHR0eBwxQRGeVV833PzujIouvmXVFW/MuTDAx7hqWcpvf25L7TuHGwdq3ngmwimL89OFzWuvQgOfT5wIVm9iKxRdCFxGbsrWaWuCDMBA54Pdg5t9E5N885N6+tra0EQxaRWlfsjspsVS2bd0X50t2FB3OAV1q9Y5WL/6+3ZRLcfjssWZK1XLHcPWZyBnTn3Grn3Ezn3Czg08BW59wSYBvwifjdlgE/Dm2UIlI3krfSO7y30OeSq/NiYnt+IZojjbx0zZehpSXl9r6m8Vx1/peYfe19zP0//wxLlgD5L3iGWfFSTB36tcQWSH9LLKf+D6UZkojUs1LUjOfTeTGXSIONVKckHn9140k8/pc38XLrdIYx9k9qY9WiK9kyZ0HG6/ttoKpExUteW/+dcw8DD8e/fx54X+mHJCL1rBQ14/l2XkzWEmlgfKSR3r6BkYMugIyc/Gca3kXkyu/RNzCc8vj0qhW/wzPSn9PrsaWmXi4iUlalON/TL4gmWtze8NBtLHnyQRrdMEPWQPcpi7jlT6/i+gvmeC5Izu/amnExGBh2GXn4KS0Rz+fItoGqnKckKaCLSFmV6nxPvyB6+M//gs/sfmCkFK/JDbN09wMs+5PjYe6HPZ8r6KeDlnFNeQXkcp+SpF4uIlJWYfZ4WTy3nUt2PZhRV20AGzf6Pi7op4NybeEvlGboIjIifbNOWCmCks1cu7vhuutiOzY7OmDtWhqGffLoQ/75da9PDV7KtYW/UAroIgJ4b9Yp9wENftIvNLcMPcPp31gDr702eqe9e6GzE8zAq2yxsTHjpvSWABMiDbzel7l7FGKz/HJt4S+UUi4iAoTbgrYY6XXrp21/gPfccE1qME/o64OJE72fqLMz6/P29g/wVlpFSzJH5S9suSigiwhA3kfGlUv6hWblI5toHnjb/wGHD8MVV4zOyBsbYz/fdlvW54XYBazRvDqb4Ht2aTVRykVE2Lwr6nmoM1Q+b5x+QZlx6NXsD+joiAXvtACe63kTvHaZlrNjYjE0QxcRbn7oOc9gbsCCE9sqepJ9+gXlwKRp/nduaYG1awt6Xj9TWiKhnbRUagroIuI7W3XAPTujRfVdgeKacaVvrV935lL6msZnjJOjjoqVJsZ7rOT7vH7yrT2vJKVcRMR396ZXb5TkhdIgJY7FVs8k7wqN9vaP9FNZ+cgmZhx6lQOTpvHdRZ9nzV1fzeMdZ+429WvnVek1hHyYK6IrWb7mzZvnduzYUbbXE5Fg0oMu5O6Nkv77RA6+PS24z+/a6nmxaG2OsPt6752bfmavut83NfRC13mBn8er3j5xwUjX3trM9lUL8xpnqZnZTufcvFz3U8pFRHx3b/pVdnjN3BOBNj0tk61feLbUi1eappizQ5Of16t974IT2zy7JtbCYmiCArqIALGgvn3VQl7oOo/tqxayeG67b2vYXP3Gk9My2YKtX417mEHXr1xx27M9obUkKBfl0EXEV66uhtlEe/u56oJruO8X36G1/xAAv5/wTm44p3MkD+43e88VdItpT5CtfW8pm2mVq41CMgV0EcnKL8jl6n1yw0O3sTSp6yHAUW+9yc333wLAljkLfGfvYQbdUrTvzaVSbRSUchGpsGLP16yE5Jw7jJ4af+Gebey89RJeuOn8jGCeMN4NxXZ7ZkmVlCJX7scvjVTKXHml2ihohi5SQdXcECuX5Jny5l1Rdnd9k9X338J4l/vUoBmHXs2anw7aM72QtIZfGqmUf+9SnMpUCAV0kQrKNpNLDpblzsXmpbubxdddx+K9ewM/pOH4jqzvIUjQLeZiGPbBE+VI63hRQBepoGwzuc27oqzZsofe/tF2rpWewXu2sf3qtbEuh0FFIinb8/0uWLmCbpCLYaWU6lSmfCmgi1SQ30xucnPEd9GxUkHLa0Y841t/nV8wnzgRvv3tke35frPsHXt/z7Zne7J+KqlUWiOIcqR1vCigi1SQ30zOjKwVJOUOWpt3RfnS3U9m1J8f+0ZPsCc46ii49daMPit+s+zuX+3L2KgEqZ9KKpXWCKrc54mCqlxEKspvh2avz6k5CeUIWo+vXc/LU45m2BqYd9Zcznt6a8Z9snY+hFgg/6d/gldf9Wyala0pWDKvCpFyVKvUGs3QRSrMayaXbeNOOYLWf31yGaf9cNPIjG/moR66HlwPMLIpCGKdD7seXE/LYNKBEy0tgbse+s2yvUR7+5nftTUlxw7lT2tUMzXnEqlCXs2yINab+/oL5hQUtAJVy3R3w1VX4V57zbOGfP+kNs644o6U2y7cs42Vj2xi5puvjhzUHLSFrdf79DtoI6E50lhzW/KLFbQ5l2boIlVqfFPDSKArJpBD9hI/iM1y521/gK6H1tM88LZnMAfv04K2zFnAzvkfLagjodcse8GJbdyzM+q7hlAtlSzVSAFdpERKVS/uNWvNdnhxEF6Ljx/a/XPe9/fLOOaNHuZNmkbzkbeyn9WJd8682BSQV8pp3vFTs6adqqGSpRrlXBQ1swlm9h9m9qSZ7TGzG+K3zzazx8zsN2Z2l5mNC3+4ItXJrztgISf7fOnuJ0u+bTw9AF64ZxtdD65nxhuv0IBj5qEepr71ZtbnGCaWM29tjjClJRJqR8JE50e/9r3VUslSbYLM0N8GFjrn/mBmEeCXZvavwBeBbzjnfmBm3wIuBzaEOFaRqlWKTS6Ji4Jfa9pss9LkTweTmyOYQW/fwMgnhfTFx5WPbEpdyATfNAvEgvljiy7m737ytUDvpVQqtUGnVuUM6C62avqH+I+R+P8csBD4TPz2O4E1KKDLGFWKTS5eF4VkM1qbPdM6kNr50Gtn6cdPa0/JS3vlwiH2H7al/fx68ztZc3ZnLE8e+N2UhipZ8hMoh25mjcBO4I+AbwL/BfQ65wbjd9kP6C8sY1YpNrlkC/7NkUYWnNjmubA5IdKQ9UKQ3kc82tvPgUnTmHkoc1PQ683vpC8yYeSsznVnLh0pU7QK5a0rsUGnVgXaWOScG3LOnQrMBN4H/LHX3bwea2adZrbDzHb09ATcVSZSY0qxycUv+DeacePHTmbbsz2eaZ3XkzYhXbhnG7/ccBnP33QBv9xwGRfu2QaM9hHfvmohRiwX3tc0PuW5+prGs+bsTj64/A7ede1POOOKO1JqzpW3rn557RR1zvUCDwMfAFrNLDHDnwkc8HnMRufcPOfcvLa2tmLGKlK1/HZ85jOz9Lso/O3Fp7B4bnvO9E1ioXPmoZ6Rhc6uB9dz4Z5tKcF4RmszW+YsYNWiK9k/qY1hjP2T2li16Ep+duo5LHl/R8V2YNZib/hqknNjkZm1AQPOuV4zawZ+CtwELAPuSVoU/U/n3G3Znksbi6Qcqr7dbBbZxj6/a6tvGZ8Bj264zDONEp08nce3PZHSjjfXpqVK/A29xjUWNxF5CbqxKEhA/+/EFj0bic3o73bOfcXM3gX8AJgK7AI+65zLWsSqgC5hq+egsHlXlKvv2u37++dvuoAGj8ynM8OGU+vYwwzYhT633wWrvbW5oE1L9aRkO0Wdc/8JzPW4/Xli+XSRqpFP+WCtzeQXz23nhp/sScmZJzSa8daxM2g5mJmisI4Oz+cK470GOXTC7+9eze1wa4W6LUpdCRoUSrURqNyuv2BORn4bYMg5/ur9n2FwQtrCZUtLymESYct1lma2v3uY54iOFdr6L3UlaPlgmKfd5Nrkk36MWj6fEhK/8+pN/sMT/ieDQ45r/u1OZhx6NTZjv/mmwI2ySiHXBdXv7371XbtpbY4QaTQGhkbflzYR5UcBXeqK187CSIPRd2SQ2avuHwmaYX28T085+B0fB2Q/Xu47a2MtaIeGoLEROjvhtljNweK57XzBJ5e++aSz2HzSWUAsGH58YjvburaWLa2U64Ka7e/b2z9ApMGY0hLxvQBKdgroUlfSdxZObo5w+MjgSN45ETRbWyKeuehiP97n2u3ZPzDE6nv/EzDf4+WGrlgOj20ZvXFoCDbEN2HHg3qQPuJBT/4ppVxb9XONe2DY0TKuiV1/9eFQxlfvlEOXupPYQPNC13lMHN+U8hEeYoHOOUKptQ4yw+8fGM4a9C/6j/u8f7Fx48i3XjXrXoKc/FNKuerxg4xbi6CF0wxd6ppfcHijf4BvfOpUz74o84tIUeRzAg+MHg6RvNW+0fm0yh0avQikfxJpMPNt6pUucfJPWGmYbBU0yeP2+ztpEbRwOrFI6lo+tc1Ba9izLWT6bdrxktjZmdz1sD8ynvEDRzzryWlshMHBzNt9Xtfv5J/02ytVp1/PewZKLWgdulIuUtfy6bGSq+QOcpc7pqccLEtPWq8Wts0Db9PwjoneD+js9H0ur1THkg9kbuH3CvJhp2H8lKJdgqRSykXqWj7tV4NUvgQpd0xOOWSbsfu1sOXwYbjiCt8ql2zv1e/kn8R7r7YTgNRJsbQU0KXuBQ0aQWrY8y13zJYz9mthS0dHLHjnCOBBpL93vxSU8tb1QSkXkbgg6ZnAuxm7u2HWLGhoYPGfzmf7cQe55VOnpjz/ujOX0h9JbWHbHxnPVSd/IrROg6Vo8yvVSwFdJC5ITjdQQOzujqVI9u4F52JfOztpf+BHjG8a/U/u0dM/zNPXfw2OPx5nRnTydK4990p+PGdBaK0IlLeub6pyEcmTZ5XLMw/DddfBvn3Q0JBSYpgQnTyd+f/r9pGfkys6arXTYK01OKtVJeu2KCKpMnLyiRl5X1/sZ49gDnDsG6n58uTF1FrsNBiks6KUl1IuIsW67rrRYJ7FgUnTMm+LB+xa7DQYpMxTykszdKlpiY/80d5+GuO7JdvLfWDDvn05n6OvaTzrzlyacXsiYOfqgVLw2EJUi58q6p1m6FKzkjf5ACNb30u5oJh4jdO2P8CjGy7j0dXncPqC9/L42vWjd/I4QAJg0BpSzuv8SdKBy5AasAtZrKx0T/da/FRR7zRDl5qVrbNhrt7mQWe2Nz/0HB/a/fOULfrtb7zC1BuugVlTYr3G165NzaETm5GvWnQlW9KCeHtrs+9r5rvJJlfKI+yZe6GfKiQ8CuhSVfJJIeT6aO/3+5yLed3dIxUrd71zGs1H3vLcov/ylV/kVyedxeLEARLxx7w8uY2vnnGpZzAvZcWK3/tLvJ+wFyvz2YUr5aGALhWTHrwXnNjGPTujgQNRrs6Gfh/9s27ff+bhlNn2zEM9ng2uAKb39oyOb8mSkZOBfrUrys/ufQpCnrn6vf9Gy+y1XqrTmNJp6351UQ5dKsIr/9v9q315VU1k662dLYBmXczzqFjx6691YNI0z/GVa/OO3yYnvza6Wqysf5qhS0Xc8JM9GcHbbyYctE9KcifBCRH/uUrWni0+FSuO1MCeXLUS7e1POd4uMWsNe+bql/Lw6zWuxcr6p4AuZbd5V9Tz+Dc/2QJRInCm58Vf7xvISNcklzh69QRfce4J8K2O2Fb9NHbUUbw81Mj03p6RgyiSc+TJVSbJrxk2vwuHFivHJgV0CaSU9c7ZNp74BtocY/I6sSc5b5we8BMzbgepdes+FSvrzu5k8JJLUnL8XsLKVedDi5VjlwK65FTqLd7ZcrlLPtDBtmd7cgai9DHlyht7LYQmgnlK5Ul8YbNvxbVMOHhgdDY+ez7NO6N8/LT2kfHlmyIqJy1Wjk0K6JJTkEMd8uGXw25tjvA3i08ueEx+rwV57mpcsoQPvXRsxhj7B4bY9mzPyAVAvcWl2qjKRXIq9RZvv+qMNRfOKXpM6c+ZSNfku6sxyHtWb3GpNpqhS05BTvJJly3nXoocb7Ya7GHnMp4z312NQd5zOXPValMrQeTsh25mxwGbgGOAYWCjc+5WM5sK3AXMAl4ELnbOvZ7tudQPvTblezp7OU5z37wryi+/citXb/0eMw69yoFJ07hl4Z9xxl9dVfR2/3K9h6CKHYsuBrUvaD/0IAH9WOBY59wTZvZOYCewGPgz4PfOuS4zWwVMcc5dm+25FNBrVz5BoSyHNXR3M/j5P6fprdHXGZzQTNN3vzOysFms5Pc8uTmCGfT2DZQ9KBbz96ymC5MUrmQB3eOJfwysj//vLOfcwXjQf9g5lzV5qIA+Nsxedb9nBYgBL3Sdl3F7QTPIWbM868U5/nh48cUCRu2v0kEx379nslo9CUlSBQ3oeS2KmtksYC7wGHC0c+4gQPzrdJ/HdJrZDjPb0dPjccK51J18FiALaQG7eVeU4b0+PcgD9CbPV6UPciimTa16lo8tgQO6mb0DuAe42jl3KOjjnHMbnXPznHPz2traChmj1Jh8qj+yBsvly6GpCcxiX5cvH7kAeJ3+A4z0Jt+8K8r8rq3MWnU//231A8xadT/zu7YW1Cu80kGxmGoa9SwfWwIFdDOLEAvm3c65e+M3/y6eaknk2V8JZ4hSa/JpTuUXFP/irr+FDRtGz+ccGsJt2MAbn+ukf2CIdWcupa9pfOqDWlpg7dq8D75IBP/ZPkG/0kGxmGZfKq0cW4IsihpwJ7EF0KuTbr8ZeC1pUXSqc25ltudSDl3S+eV4f7vuQprccMbtg9bAH63cAsCFe7ax8pFNI1UuM2/7BixZ4vucCcn54yD5cb/7JO8arebqEVW51L6gOfQgdejzgUuBp8xsd/y2vwS6gLvN7HJgH/DJQgcrY5dffXijRzAHUm7fMmfBSIOs9tZmti+JBel8Dr4IsgvWq948397tlaQ2AGNHzoDunPsl/i2hzy7tcKQUqnlGljy2ZS9sZ+Wjm7jo5QP8bnIbN55xKTvmf5QV556A3dg4mm5JMmSZWcL0FEI+B18EzY+nB8X5XVvLdoiESFDa+l9nKn1wcNCxXbBnGyvv/TotB6OYcxzT+wq3bt3A9uMOxgJiZ2fG4x3QfcqilNu88sn5HHxRaH680gulIl609b8GZZuBl7qRVj6vnUvy2FY+sinjnE76+mInBi1ZArfdBsDwtzdiw0MMWQPdpyzi+nOXAxBpNG7+xCmer51+8EVjvLVuu8d4/VI+C05sY37XVt/3WUg7BJGw5b2xqBhaFC1erkW8YjahFPvavuKHLg/v3TfSjvaW+75Og9dIzWA4NX++eVeUNVv20NsfOxRjSkuE6y+YU9I2AunB36sve5CFUu3AlDCEtlO0GAroxcu188/v935Nq0r52p66uz0PjHgrMo6p/W9m3j+EnZ5BeAXodOnvs5rXKqS+lLLKRQIox3/cm3dFfRf7ErlbrxQCZNZiQ/7VGAXljT0OXW4ZfJv+pnH0NY1PTbvE68grIUh/9VwLpSKVpoBeAqU+0Sfba/hJ5G7TS+xyHc2W6zWTL1KtLRHPs0ATr+15UfPZij/lrT/wlYtXsfLRTbS8fIC+Y2aw7oNLufOpVmZ0bWXBiW1lrfEOspip/LhUO6VcSqAcDZCybZbJlrstNKfulYKINBgYDAyNPmPitcH7YOKd//B5Wg56VNgkpVaCpDuSXyeMT0K5NiMpPy6VFEpzLvHmN7uL9vYX3D8k6GsAI4HOa/t6oWV5XimIgWHHxHFNnlvQ/apr1n1waSyVkiwttRIk3dE/MMSaLXtCK8n0KnVMbL7IZ6u9SCUp5VIC2Tay5Jt+8cvF+6U72uOB2S/lk+9JPQl+F5A3+gfY/Uc98K3rYp0Nv9UBa9dyoLfV8/53zp7Pmo1zYrn0fftizbPWrk3pWR60djtR5ZKsVCWZ5Tx9SCQsCugl4LcQmZBPztorMO/Y+3v+8NZgxv0jjcaKc0/IWnueSPnkG6j8LlLLXtgOf3/r6ELn3r3Q2cmy86/ie7PnZ9x/cnOE+S8dy4FPf9P3tXPt7MylVJt5tMgptU4plxJI7obnJ0jQ8QvM33/sJQaGMzPhE8c1sXhue87qk8Vz29m+aiEvdJ3H9lULi+rSt/LRTRlVK/T1sfLRTRn3jzQYh48M5kyRZNvZGYQWK0ViFNBLJBE0/YJ6MYcRpFepJLwRT0GUvL3r8uUsPv14nvmbj/DbdRfylYduG8kjt7x8wPMhLS8fyGjx+o4JTSkLqOB9MIRXe9igqqUVbK4WvCLloJRLiRWas4bsJ9l7BfUZrc1s3hXl8NuZ6ZiCA93y5bE+5MQWBZvcMEt3P8DSPzke5t4Wy4F7Hf3W0ZGRspi96n7Pl/C6cHk1v/L6W7Q2R5g4vqmq8tzlKFsVCUIz9BIL4zCCS95/nOftC05sY/W9T2UsFk5pieRdlfH42vW8POVoXDyYZ9i4MfZ17dqcVSsJxXxy8PtbrLlwTt7po7BV+og6kQTN0ENQ6OJatkqLecdPzbjdr9yvJZ5bD+rxtet5zw3X0Dzwtv+dEq1sE9UpWapWErw+rRiw4MTcRxHWUtWJOi9KtdDGIg9hbuMv5XP7bRoCPDsLAiONspKD8ctXfpFjenOcINjYCIOZqZ1c/t/mp+j+1b6sja7KKYx/23JsLJOxTb1cChRmPjTf584VfPKuf09vlBUvOZyeXrXixaM/uZ/kcTeYZVx0KnUQRFj/tsWsm4iUkmboacKcbeXz3IWedZkueRHx37/9Oc+Z+KA1eJ7f6QBrbIwF83h/8vQxpl9wILMFgJdStPPNV5j/tuq8KGHSDL1AYeZD83nufM+69ApU6YcoH32ox/P1G91wRufD/sh4nr7+a5x+3ZWej/Gb7U6INOQM5lCZ2vEw/221KUmqgQJ6mjBPosn23OkzvFxtchMSgSQx+xwN4rHgnShjmnmoh2G8D4eNTmpj3ZlLufaRTRx76FVeaW3jpWu+7BvMwf+CEySYVyodoVOGpN6pbDGNX7lcKQKQ33Mnyg+Td1T6ncrtF3xWnHsCn3ju3+h6cD0zD/XQQOY/bgOQnljpaxrPujOXsmXOAs644g4a3DDqUG5IAAAO4UlEQVTHvP67rME8W192P41meZdxllqY/7Yi1UAz9DRBy+WC5kzT7/fx09oz+nx7zXYdeB6D5hd8Fs9t58OP/XPmOZ0eopPaODaehkkEcwg2U83Vlx2fcVdDt8JaKoUUKUTVB/RiF5sKeXyufGjQagmv+92zM5oR3L5w127P13HEFjUTG4cmRLJ/oPLblp/s4OTpbPjez7lnZzRQVUb636/vyGDOtEryxci3fLJClOuWelbVKZdEQCy0/3WuxxfafyPozsCg9/ObGbc2R3h7cDRJ8nrfAKvvfYrH166HWbOgoSH2tbs7doeOjqzj7msaz4EVX+ZvFp8caDer19/Pq4Wvl0Qwr5bdnCJjQVXP0INUehT6ePDvIZ7ruYNWSwS9n9+OSq/+3x/a/XPe89B6SOzqjNeSA7Edm2kHMicWQqOT2vjuos+zJp4bzzZTTczKi2lpC9opKVJuVR3Qiy0zy/b4IBcLv3RN0GqJoPdLLz9Mz0EnW/nIpswt+n19sd2f8SPd+lZcy4SDB1Jy5MlHuGUTpLY9qPT3qVptkXBVdcql2Law2R6f62KRLV0TtFoin6qK5Pa72bZ6zTj0qvcvEocxL1lCy4H9bHniJT61+gf8ZM6CvCpLghwH19ocydniNv19Fps+E5HccgZ0M7vdzF4xs6eTbptqZj8zs9/Ev04JY3DFlplle3yui0WuGXzygRaNZiO/Sw5QhXRezPXp4+Bkn8ZWafnzQg61CPL6yR0P/UorgYz3qY6EIuELknL5HrAe2JR02yrgF865LjNbFf/52lIPrtgys1yPz9Z/I8gpQOnP4ZWHz7eqItumovbWZg6s+DLtX7029dQgn/a1XnKlPfzOLk28fvL9/cba3tqc8Z7VkVAkfDkDunPuETOblXbzRcBZ8e/vBB4mhIAOxZeZ+T0+V7APkv8udtHWi1+jp9EZ70KYNSVQ+9p0ucotN++K+p5devMnTsl4T/k0pdIuTZHwBWrOFQ/o9znn3hP/udc515r0+9edc55pFzPrBDoBOjo6TtvrddpNFQrSHKug9rUQKzO86ip47bXYz0cdBbfeOhKUC920lOvTy9yv/NRz9p1o4OX3yaC1OcLu6z/s+bt8xprr7yki3qqmOZdzbiOwEWLdFsN+vVIJku7Ju30txIL5ZZfBQFJgfe01+NznYt8vWRLoU0khrXj9Uim9/QOeJZIJb2T5XdBPUNqlKRK+QmfozwFnOecOmtmxwMPOuZwrlbXQPjcfQUr82lub2X7cwdQZuZ/jjx8pPcwl31awfvcPQgc1iFRW0Bl6oWWLW4Bl8e+XAT8u8HlqWnq1i5d52x+Izb5zBXMYLT0MIN9FxkIXH9W8SqR2BClb/D7w78AJZrbfzC4HuoAPmdlvgA/Ffx6TkuvHvaz+5T/CkSPBnizH1v1k+dbo+91uWWoPK9kZUUTyF6TK5RKfX51d4rHUtBXnnsAvv3IrV2/93siBErcs/DOOfsP7UIkM48YFLj1MvF4+x5553T/SYEQajb6B1Ka6WqwUqU1VvfW/lix+5mHOf3A9TW/FUhszD/XQ9eB6bOrU3OmWtCqXQK+X5yJj+v0nN0c4fGQwI5hPaYlw/QVzFMxFapDOFC2AZ6nen86PNcpKd9RR8OabmWmXSATuuCOvIF5KOqlepHaEvSg6Zj2+dj2nL3gvj64+h0c3XMZp2x9g9b1P4fwWNH//e7j99lhgTzjqqIoGc9DOTZF6pJRLPrq7ec8N14x0O0ykVQB+N7mNY3pfyXxMR0cscFcweHvRzk2R+qMZej6uuy6jdW3L4NusfGQTN55xaaynSsovg/dYKTedrylSfxTQ8+GTVplx6FV2zP8obNwY2xxkFvu6cWPgHiuFnJxUjEI6QYpIddOiaLrly+Hb34bhePXHxImxn5csiR335rHwGZ08nce3PVFQMFSPExHJpWp6uZRKPo2oCj4ZZ/ly2LAh9bbDh2FZfFOsxxFv/ZHYOZ2FBt8wOjaKyNhUEwE9n0ZU+TatSrFxo/ftQ0MpR7wlt65tXruW04tY8FS1iYiUSk3k0PM57Sbwfbu7YymUhobY1+7uWOD2k3TEGy++GEvJvPhi0dUrxR6zJyKSUBMBPZ9ZbKD7Ll8Ol14ay4c7B3v30n/Z5Qxla2ySR5+VfKjaRERKpSZSLn4105ObI0BqzrzBjCGPhd7EjPfxtes5bcOGjCtZ88DbvNk0nncMvp15VmZjY2jlh+oTLiKlUhNVLpt3RVnxL08yMJw61kij8anTj+OendGsPckTVSMApy94L+1veGwAAoYx/vHUj/DZ3f9KAy4W2JOqXApebBURKULQKpeaCOjgf3xao8+MvNGMYedSAu/8rq08uvocGnwOjts/qY0zrrgDAANe6Dpv5HcqLxSRSqm7ssVen+PTvII5wLBzKQEZYnn0A5OmMfNQZkvbYWDdmUtHfk6kaBKzcq+Uj8oLRaSa1MSiKPhXfTT6LGR63X9GazPrzlxKX9P4lNuHgX889aNsmbMAiM3Oo739nHrDT1nxwyezHt2m8kIRqRY1M0P3OqDBiM3QDVKSKH5VIivOPYHVh2NtbFc+sokZh17l4OQ2tl/2BTYe/T7o7U95rmwHJyf4XWhy5duVjxeRUquZgJ5cDRJNC7wORn5uzxIcR55j4jg+OGfBSCC9eG47F5P/QcqRRvO8cOTa3FTU5icRER81E9AhFuwWP/MwL9/0Rab39nBg0jTWnbmULXMWjATzXIczLJ7b7hk0N++K5hXMASaOa/J8rlzb+bXdX0TCUFMBne5u6OzkmHgvleR+5FvmLCg4n52YMefrDZ+UTK7NTdruLyJhqJlFUSDWQyWpMRaM9iOH2EajQtrQes2YEyINRoPPBtJ8t+0nbtd2fxEJQ20F9Cz9yAEOHxkk2tuPYzQvHSSoZ5sZ3/zJU/j6xafmtT0/13Z+bfcXkTDUVsqlo8OzH/mBSdMAGBhKrUlPz0v7VZb4tRZob21OyWkHrUrJtZ1f2/1FJAw1s1MUgO5u+i+7POUYuL6m8axadOVIDbmX1uYIRwaH6BsYTrk9uSWAdoGKSLUKulO0tlIuS5bw9PVfIzp5OsMY+ye1sWrRlfzs1HOY0hLxfVhv/0BGMIfUGbyOYxORWldbM/Q4r9QJZM6yg0jv2VJO2lwkIkHUXS+XZH615IBv3xU/laos0eYiESm1olIuZrbIzJ4zs9+a2apSDapQi+e2s33VQtoDBulyVZZs3hXNKKfM5xQmEZEgCg7oZtYIfBP4CHAScImZnVSqgRXDqyww3ZSWSFny5ImZeHo5pd+nCG0uEpFCFZNyeR/wW+fc8wBm9gPgIuCZUgysGOllgZObI5jFWvCWO1ftNxP36+OuzUUiUqhiAno78FLSz/uB9xc3nNLJlmcvJ78Z95BzNEcaM0oltblIRApVTA7da0N8xpTTzDrNbIeZ7ejpyTxYot75zbgTpZEqlRSRUilmhr4fOC7p55nAgfQ7Oec2AhshVrZYxOuVVLlKBr36uCdm4tXyKUJE6kMxAf1x4N1mNhuIAp8GPlOSUYWsnCWD2uYvIuVScEB3zg2a2ZXAQ0AjcLtzbk/JRhaicvcj10xcRMqhqI1FzrkHgAdKNJayUT9yEalHtdXLpUTUj1xE6tGYDOjqRy4i9agme7kUSwuVIlKPaj6gF1p+qIVKEak3NR3Q1bFQRGRUTefQ1bFQRGRUTQd0lR+KiIyq6YCu8kMRkVE1HdBrrfzQ66ALEZFSqelF0VoqP9QCroiEraYDOtRO+WG5+8eIyNhT0ymXWqIFXBEJmwJ6mWgBV0TCpoBeJrW2gCsitafmc+i1opYWcEWkNimgl1GtLOCKSG1SykVEpE4ooIuI1AkFdBGROqGALiJSJxTQRUTqhDnnyvdiZj3A3gIfPg14tYTDqQV6z2PDWHvPY+39QvHv+XjnXFuuO5U1oBfDzHY45+ZVehzlpPc8Noy19zzW3i+U7z0r5SIiUicU0EVE6kQtBfSNlR5ABeg9jw1j7T2PtfcLZXrPNZNDFxGR7Gpphi4iIlnUREA3s0Vm9pyZ/dbMVlV6PGEzs+PMbJuZ/drM9pjZVZUeUzmYWaOZ7TKz+yo9lnIws1Yz+6GZPRv/t/6TSo8pbGb2hfj/p582s++b2YRKj6nUzOx2M3vFzJ5Oum2qmf3MzH4T/zoljNeu+oBuZo3AN4GPACcBl5jZSZUdVegGgS855/4Y+ADwv8fAewa4Cvh1pQdRRrcCDzrnTgROoc7fu5m1A/8XmOecew/QCHy6sqMKxfeARWm3rQJ+4Zx7N/CL+M8lV/UBHXgf8Fvn3PPOuSPAD4CLKjymUDnnDjrnnoh//yax/9Druu+umc0EzgO+W+mxlIOZTQLOBP4BwDl3xDnXW9lRlUUT0GxmTUALcKDC4yk559wjwO/Tbr4IuDP+/Z3A4jBeuxYCejvwUtLP+6nz4JbMzGYBc4HHKjuS0N0CrASGKz2QMnkX0APcEU8zfdfMJlZ6UGFyzkWBrwH7gIPAG865n1Z2VGVztHPuIMQmbMD0MF6kFgK6edw2JkpzzOwdwD3A1c65Q5UeT1jM7HzgFefczkqPpYyagPcCG5xzc4HDhPQxvFrE88YXAbOBGcBEM/tsZUdVX2ohoO8Hjkv6eSZ1+DEtnZlFiAXzbufcvZUeT8jmAxea2YvEUmoLzeyfKjuk0O0H9jvnEp+8fkgswNezc4AXnHM9zrkB4F7gf1R4TOXyOzM7FiD+9ZUwXqQWAvrjwLvNbLaZjSO2iLKlwmMKlZkZsdzqr51zX6/0eMLmnFvtnJvpnJtF7N93q3OurmduzrmXgZfMLHFK+NnAMxUcUjnsAz5gZi3x/4+fTZ0vBCfZAiyLf78M+HEYL1L1Z4o65wbN7ErgIWKr4rc75/ZUeFhhmw9cCjxlZrvjt/2lc+6BCo5JSu//AN3xicrzwGUVHk+onHOPmdkPgSeIVXLtog53jZrZ94GzgGlmth+4HugC7jazy4ld2D4Zymtrp6iISH2ohZSLiIgEoIAuIlInFNBFROqEArqISJ1QQBcRqRMK6CIidUIBXUSkTiigi4jUif8P+Gid/WRBpkQAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.scatter(x[:,0], y)\n",
    "plt.scatter(x_new[:,0], y_pred, c='red')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "新来的样本落于一条直线上，这条直线就是回归模型学到的，观察模型参数，和产生数据的模型很接近。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(array([4.00044564]), 2.239418872523075)"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lr.coef_, lr.intercept_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 误差"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "上面的例子中，训练样本和训练得到的模型之间存在误差，我们可以使用使用下面代码计算一下："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4.238984148444511"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from sklearn.metrics import mean_squared_error\n",
    "\n",
    "y_pred = lr.predict(x)\n",
    "\n",
    "mse = mean_squared_error(y, y_pred)\n",
    "rmse = np.sqrt(mse)\n",
    "rmse"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "这里得到的均方根误差和前面加入的高斯噪声的方差很接近，其实理论上就应该是 4 的。\n",
    "\n",
    "再回归模型中，均方根误差有着重要的性质，均方根误差其实就是误差的标准差。如果误差呈高斯分布，那么这里的均方根误差就是误差的标准差，即高斯分布中的 $\\sigma$。3$\\sigma$ 的理论说，高斯分布中的样本距离均值距离小于 $1\\sigma$ 的概率为 68%，小于 $2\\sigma$ 的概率为 85%，小于 $3\\sigma$ 的概率为 99.7%。\n",
    "\n",
    "因此，有了模型的均方根误差，就能够估计当前模型预测出的结果与实际值的差异。比如均方根误差为 4，那就说明有 68% 的可能性，误差小于 4，有 95% 的可能性误差小于 8，以此类推。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 使用正则化\n",
    "\n",
    "如果要想使用带有正则化的线性回归，可以使用 `Lasso` 和 `Ridge`。这里我使用 `make_regression` 生成了 10000 个样本，每个样本有 10 个特征，其中只有 5 个有意义。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.datasets import make_regression\n",
    "\n",
    "x, y = make_regression(n_samples=10000, n_features=10, n_informative=5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### L1 正则化\n",
    "\n",
    "要想使用 L1 正则化，可以使用 `Lasso` 类："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([80.17853949,  2.20934648, 12.11783159, -0.        ,  0.        ,\n",
       "       39.47341535,  0.        , -0.        , -0.        , 37.02262417])"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from sklearn.linear_model import Lasso\n",
    "\n",
    "lasso = Lasso(alpha=1)\n",
    "lasso.fit(x, y)\n",
    "lasso.coef_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### L2 正则化\n",
    "\n",
    "要想使用 L2 正则化，可以使用 `Ridge` 类，L2 正则化的线性回归也叫作岭回归。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 8.11755063e+01,  3.16308209e+00,  1.31036902e+01, -1.63406892e-04,\n",
       "        1.30984317e-04,  4.04597611e+01,  8.24451723e-05, -6.75223824e-05,\n",
       "       -1.83452397e-05,  3.80118322e+01])"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from sklearn.linear_model import Ridge\n",
    "\n",
    "ridge = Ridge(alpha=1)\n",
    "ridge.fit(x, y)\n",
    "ridge.coef_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### L1 + L2\n",
    "\n",
    "如果想要同时使用 L1 和 L2 正则化，可以使用 `ElasticNet`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 5.35787131e+01,  1.84979079e+00,  8.48145028e+00, -2.28297205e-02,\n",
       "        0.00000000e+00,  2.67749729e+01,  0.00000000e+00, -0.00000000e+00,\n",
       "       -0.00000000e+00,  2.50762698e+01])"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from sklearn.linear_model import ElasticNet\n",
    "\n",
    "elastic = ElasticNet(l1_ratio=0.5)\n",
    "elastic.fit(x, y)\n",
    "elastic.coef_"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "可以看到使用 L1 正则化确实得到了稀疏解，使用 L2 得到的参数都比较小。\n",
    "\n",
    "\n",
    "## 推荐阅读\n",
    "\n",
    "关于线性回归理论，推荐看李宏毅老师[机器学习课程](http://speech.ee.ntu.edu.tw/~tlkagk/courses_ML16.html)关于回归和梯度的章节。\n",
    "\n",
    "另外建议花时间尝试 sklearn 这个机器学习库，skleran 的[用户手册](https://scikit-learn.org/stable/user_guide.html)是非常不错的资源，你可以尝试其中的例子，并尝试弄清楚其中的原理。\n",
    "\n",
    "了解常用的机器学习工具是很有必要的，学习如何使用 jupyter notebook，以及了解机器学习中常用的数据处理库，numpy 和 pandas，matplotlib 等。学习这些内容我推荐阅读 [Python Data Science Handbook](https://jakevdp.github.io/PythonDataScienceHandbook/)。\n",
    "\n",
    "[Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow](https://book.douban.com/subject/30310982/) 是目前最好的机器学习实战书籍，强烈建议阅读。"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
